Statistical Disclosure Control

Software tools for tabular data (WP 4.2)

Leading partner: CBS

Participating partner: CBS

Objectives

This consortium plans to take over the further development of τ-ARGUS resulting from the SDC-project. The main remaining wishes for extending τ-ARGUS concern the complexity of the tables to be protected. The current version of τ-ARGUS protects only tables of up to 3 dimensions without hierarchies. This has proven to be too severe a restriction: both linked tables and hierarchical tables are very common structures among the tables produced at statistical offices and elsewhere. However, these more complex tables dramatically increase the complexity of the problems to be solved, and finding the optimal solution for secondary cell suppression will be very hard. This problem is investigated in WP 4.1, and the results will be implemented in τ-ARGUS. Besides this overall optimal but computationally very heavy approach, several heuristics for finding good (possibly near-optimal) solutions will be investigated and implemented in τ-ARGUS. In addition to ASCII, other input formats for micro-data (SAS, Oracle) will be allowed, as well as ready-made tables. For the manipulation of hierarchical tables, new data structures need to be implemented. The testing of the software is foreseen in WP 6.
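To illustrate why hierarchical tables call for dedicated data structures and for secondary suppression, the following is a minimal sketch in Python. It is not the τ-ARGUS design: the names Node and exposed_cells, the category codes, and the cell values are all hypothetical and chosen only for illustration. The sketch covers only the simplest disclosure risk, in which a suppressed cell can be recomputed exactly by subtracting its published siblings from a published parent total; it ignores protection intervals and linked tables.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One category in a hierarchical code list (hypothetical structure)."""
    code: str
    value: Optional[int] = None   # cell value, set on leaf categories only
    suppressed: bool = False      # True if the cell is withheld from publication
    children: List["Node"] = field(default_factory=list)

    def total(self) -> int:
        """Cell value of this category: own value for a leaf, else the sum of the children."""
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)


def exposed_cells(node: Node) -> List[str]:
    """Codes of suppressed cells that can be recovered exactly by subtraction.

    A suppressed cell is exposed when its parent total is published and it is
    the only suppressed cell among its siblings.
    """
    found = []
    hidden = [child for child in node.children if child.suppressed]
    if node.children and not node.suppressed and len(hidden) == 1:
        found.append(hidden[0].code)
    for child in node.children:
        found.extend(exposed_cells(child))
    return found


# Made-up example: cell A2 is a primary suppression, everything else is published.
tree = Node("Total", children=[
    Node("A", children=[
        Node("A1", 12),
        Node("A2", 3, suppressed=True),
        Node("A3", 9),
    ]),
    Node("B", children=[Node("B1", 7), Node("B2", 5)]),
])

print(exposed_cells(tree))
```

In this made-up table, A2 is reported as exposed because the published total of A minus A1 and A3 reveals it. Suppressing a second cell under A (a secondary suppression) removes that exact recovery, which is the kind of choice the optimal and heuristic methods mentioned above are meant to make well.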

Description of work

Task 1 Migration from Borland C++ to Visual C++; input of ready-made tables; hierarchical structures; top-down hierarchical search; provision of a COM interface.
Task 2 Reading SAS and Oracle input files using OLE-DB drivers and SQL-server; interface to the GHQUAR hypercube method; linked tables (WP 3).
Task 3 Optimal search library (WP 4.1, ULL); network-flow models (WP 4.1, UPC); implementation of the final test results.

Milestones and expected result

The extended version of τ-ARGUS will contain several new approaches for secondary cell suppression. τ-ARGUS will support the much-needed more complex data structures, such as hierarchical and linked tables. The advantage is that the user can apply different methods depending on the requirements at hand and can also compare the different results. We expect that the software will be widely used as a standard tool for the disclosure control of tabular data. The participation of many European partners, both as developers and as testers, supports this expectation. In addition, the results of this development are well suited for use in (TES) courses on Statistical Disclosure Control.